Weak Supervision and Clustering-Based Sample Selection for Clinical Named Entity Recognition

Abstract

One of the central tasks of medical text analysis is to extract and structure meaningful information from plain-text clinical documents. Named Entity Recognition (NER) is a sub-task of information extraction that involves identifying predefined entities in unstructured free text. Notably, NER models require large amounts of human-labeled data to train, but human annotation is costly and laborious and often requires training. Here, we aim to overcome the shortage of manually annotated data by introducing a training scheme for NER that uses an existing ontology to assign weak labels and provides enhanced domain-specific model adaptation with in-domain continual pretraining. Due to limited annotation resources, we develop a specific module to collect a more representative test dataset from the data lake than random selection would. To validate our framework, we invite clinicians to annotate the test set. In this way, we construct two Finnish NER datasets based on clinical records retrieved from a hospital’s data lake and evaluate the effectiveness of the proposed methods. The code is available at https://github.com/VRCMF/HAM-net.git .
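As a rough illustration of what assigning weak labels from an ontology can look like, the sketch below tags tokens in BIO format by greedy longest-match lookup against a term dictionary. It is not the authors' implementation (their code lives in the repository linked above). The toy ontology, the entity types, the English example sentence, and the function name `weak_label` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy ontology: lowercased term tokens -> entity type.
# A real ontology would be far larger; these entries are illustrative only.
TOY_ONTOLOGY: Dict[Tuple[str, ...], str] = {
    ("diabetes",): "DISEASE",
    ("heart", "failure"): "DISEASE",
    ("metformin",): "DRUG",
}


def weak_label(
    tokens: List[str],
    ontology: Dict[Tuple[str, ...], str] = TOY_ONTOLOGY,
) -> List[str]:
    """Assign BIO tags by greedy longest-match lookup against the ontology."""
    lowered = [t.lower() for t in tokens]
    # Longest term length bounds how many tokens we try to match at once.
    max_len = max((len(term) for term in ontology), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(lowered[i:i + n])
            if span in ontology:
                label = ontology[span]
                tags[i] = f"B-{label}"
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{label}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


sentence = ["Patient", "has", "heart", "failure", "and", "takes", "Metformin"]
print(list(zip(sentence, weak_label(sentence))))
```

Labels produced this way are noisy: exact matching misses inflected or misspelled terms and cannot disambiguate by context, which is one reason a clinician-annotated test set is still needed for evaluation.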

Similar resources

Active Sample Selection for Named Entity Transliteration

This paper introduces a new method for identifying named-entity (NE) transliterations within bilingual corpora. Current state-of-the-art approaches usually require annotated data and relevant linguistic knowledge, which may not be available for all languages. We show how to effectively train an accurate transliteration classifier using very little data, obtained automatically. To perform this tas...

Clique-Based Clustering for Improving Named Entity Recognition Systems

We propose a system which builds, in a semi-supervised manner, a resource that aims at helping a NER system to annotate corpus-specific named entities. This system is based on a distributional approach which uses syntactic dependencies for measuring similarities between named entities. The specificity of the presented method, however, is to combine a clique-based approach and a clustering techni...

Learning Dictionaries for Named Entity Recognition using Minimal Supervision

This paper describes an approach for automatic construction of dictionaries for Named Entity Recognition (NER) using large amounts of unlabeled data and a few seed examples. We use Canonical Correlation Analysis (CCA) to obtain lower dimensional embeddings (representations) for candidate phrases and classify these phrases using a small number of labeled examples. Our method achieves 16.5% and 1...

Supervised Named Entity Recognition for Clinical Data

Clinical Named Entity Recognition is a part of Task 1b, organised by the CLEF eHealth organisation in 2015. The aim is to automatically identify clinically relevant entities in medical text in French. A supervised learning approach was used to train the tagger, based on Conditional Random Fields (CRF). An extensive set of features was used for training. Pre...

Named Entity Recognition in Persian Text using Deep Learning

Named entity recognition is a fundamental task in the field of natural language processing. It is also regarded as a subtask of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2023

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-43427-3_27